ADVERSARIAL SMOOTHED ANALYSIS 



FELIPE CUCKER, RAPHAEL HAUSER, AND MARTIN LOTZ

Abstract. The purpose of this note is to extend the results on uniform smoothed analysis of
condition numbers from [1] to the case where the perturbation follows a radially symmetric
probability distribution. In particular, we will show that the bounds derived in [1] still hold in the case
of distributions whose density has a singularity at the center of the perturbation, which we call
adversarial.
1. Introduction. Condition numbers play a central role in numerical analysis.
They occur in the error analysis of finite-precision algorithms (this being historically the
reason for their introduction in the late 1940s by von Neumann and Goldstine [10] and
Turing [9]) and as a parameter in expressions bounding the number of iterations in a
variety of algorithms (a paradigmatic example being the conjugate gradient method [5,
Theorem 38.5]). In practice, however, a difficulty appears: it would seem that to know
the condition number of a given data one needs to solve the problem at hand on this
data. An inconvenient circularity. A way out of it, proposed by Steve Smale (see [5]
for a review), is to assume a probability measure on the space of data and to study
the condition number $\mathscr{C}(a)$ at data $a$ as a random variable. In other words, to study
the condition number of random data.

In doing so Demmel [2] noticed that most condition numbers could be written
as (or at least reasonably sharply bounded by) the relativized inverse of the distance
from the data $a \in \mathbb{R}^{n+1}$ to a set of ill-posed instances $\Sigma \subset \mathbb{R}^{n+1}$. That is, one could
write

$$\mathscr{C}(a) = \frac{\|a\|}{\operatorname{dist}(a,\Sigma)}. \tag{1.1}$$

The simplest example of this phenomenon is given by the condition number for matrix
inversion and linear equation solving. For a non-singular square matrix $A$ it takes the
form $\kappa(A) := \|A\|\,\|A^{-1}\|$, where $\|\ \|$ denotes the operator norm. The Condition
Number Theorem by Eckart and Young states that $\|A^{-1}\| = d(A,\Sigma)^{-1}$, where $\Sigma$ is
the set of singular matrices.
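As a small numerical illustration of the Condition Number Theorem, the following Python sketch exhibits, for one 2×2 example, a singular matrix $B$ with $\|A - B\| = 1/\|A^{-1}\|$. That no singular matrix lies closer is the content of the theorem and is not checked here. The helper names and the example matrix are assumptions made for this illustration; the closed-form singular values are specific to the 2×2 case, and the eigenvector formula assumes that the off-diagonal entry of $A^T A$ is nonzero.

```python
import math

def singular_values_2x2(m):
    """Return (s_max, s_min) of a 2x2 matrix m = [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    frob2 = a * a + b * b + c * c + d * d      # s_max^2 + s_min^2
    det = abs(a * d - b * c)                   # s_max * s_min
    disc = math.sqrt(max(frob2 * frob2 - 4.0 * det * det, 0.0))
    return math.sqrt((frob2 + disc) / 2.0), math.sqrt(max((frob2 - disc) / 2.0, 0.0))

def inverse_2x2(m):
    """Inverse of a non-singular 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def singular_neighbour_2x2(m):
    """Return B = A (I - v v^T), with v the right singular vector for s_min."""
    (a, b), (c, d) = m
    _, s_min = singular_values_2x2(m)
    p, q = a * a + c * c, a * b + c * d        # first row of A^T A
    v = (q, s_min * s_min - p)                 # eigenvector of A^T A (assumes q != 0)
    norm = math.hypot(*v)
    v = (v[0] / norm, v[1] / norm)
    av = (a * v[0] + b * v[1], c * v[0] + d * v[1])   # A v, a vector of length s_min
    return [[a - av[0] * v[0], b - av[0] * v[1]],
            [c - av[1] * v[0], d - av[1] * v[1]]]

A = [[2.0, 1.0], [1.0, 3.0]]                          # hypothetical example matrix
s_max, s_min = singular_values_2x2(A)
norm_inv = singular_values_2x2(inverse_2x2(A))[0]     # operator norm of A^{-1}
B = singular_neighbour_2x2(A)
det_B = B[0][0] * B[1][1] - B[0][1] * B[1][0]
diff = [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]
dist = singular_values_2x2(diff)[0]                   # operator norm of A - B

print("||A^{-1}||         =", norm_inv)
print("1 / ||A - B||      =", 1.0 / dist)
print("det(B)             =", det_B)
print("kappa(A) = ||A||/d =", s_max / dist)
```

The matrix $B$ is obtained by removing from $A$ its action on the right singular vector belonging to the smallest singular value, so $B$ annihilates that vector and is therefore singular.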

In most applications, $\Sigma$ is a pointed cone. Therefore, one could normalize so that
$a$ belongs to the $n$-dimensional unit sphere $S^n$. Note that the usual assumption that
$a$ has a Gaussian distribution in $\mathbb{R}^{n+1}$ yields a uniform distribution in $S^n$ after this
normalization. It is for condition numbers as in (1.1), which we shall call conic,
with inputs drawn from the uniform distribution on $S^n$ that Demmel proved in [3]
(shortly after [2]) a general result bounding their tail as a function of $n$ and the degree
of an algebraic hypersurface containing $\Sigma$.

Very recently, a new paradigm for probabilistic analysis was proposed by Spielman
and Teng [6, 7]. Called smoothed analysis, it consists of replacing the idea of "random
data" by that of "random perturbation of a given data" and studying the worst case
(w.r.t. data $a$) of the latter. In its original formulation, and in the case of a condition
number $\mathscr{C}(a)$, this amounts to studying the tail

$$\sup_{a \in \mathbb{R}^{n+1}} \operatorname{Prob}_{z \in N(a,\sigma^2)} \{\mathscr{C}(z) \ge t\}$$

or the expected value

$$\sup_{a \in \mathbb{R}^{n+1}} \mathbf{E}_{z \in N(a,\sigma^2)} [\ln \mathscr{C}(z)],$$

where $N(a,\sigma^2)$ is a Gaussian distribution centered at $a$ with covariance matrix $\sigma^2\,\mathrm{Id}$
and $\sigma^2$ small (with respect to $\|a\|$). In [1], to obtain general results as in [3], data was
again restricted to $S^n$ and the expressions above replaced by

$$\sup_{a \in S^n} \operatorname{Prob}_{z \in B(a,\sigma)} \{\mathscr{C}(z) \ge t\}$$

and

$$\sup_{a \in S^n} \mathbf{E}_{z \in B(a,\sigma)} [\ln \mathscr{C}(z)],$$

where $B(a,\sigma)$ is the open ball (that is, the spherical cap) in $S^n$ centered at $a$ and of
radius $\sigma$, and $z$ is drawn from a uniform distribution on this ball.

One of the claimed advantages of smoothed analysis is a smaller dependence
on the underlying distribution. It follows from this claim that the replacement of
Gaussian perturbations by uniform ones should not significantly affect the smoothed
analysis of $\mathscr{C}(a)$. The goal of this note is to further pursue this claim by extending
the main result in [1], combining it with ideas from [4], to a class of distributions we
call adversarial. The support of such a distribution is, as in the uniform case, the ball
$B(a,\sigma)$, and it is radially symmetric as well. But its density increases when
approaching $a$ and has a pole at $a$.

2. Preliminaries. We assume our data space is $\mathbb{R}^{n+1}$, endowed with a scalar
product $\langle\ ,\ \rangle$. In all that follows we consider problems whose set of ill-posed inputs $\Sigma$
is a point-symmetric cone in $\mathbb{R}^{n+1}$. That is, if $x \in \Sigma$ then $\lambda x \in \Sigma$ for all $\lambda \in \mathbb{R}$. By a
conic condition number we understand a function $\mathscr{C}\colon \mathbb{R}^{n+1} \to [1,\infty]$ such that for all
$a \in \mathbb{R}^{n+1}$ we have

$$\mathscr{C}(a) = \frac{\|a\|}{\operatorname{dist}(a,\Sigma)},$$

where $\|\ \|$ and dist are the norm and distance induced by $\langle\ ,\ \rangle$. Note that for $\lambda \neq 0$
we have $\mathscr{C}(\lambda a) = \mathscr{C}(a)$. We can therefore work with the $n$-dimensional real projective
space $\mathbb{P}^n$ as ambient space. If we also denote by $\Sigma \subset \mathbb{P}^n$ the image of the ill-posed
cone in projective space, then for $a \in \mathbb{P}^n$ it follows that

$$\mathscr{C}(a) = \frac{1}{d_{\mathbb{P}}(a,\Sigma)},$$

where $d_{\mathbb{P}}(x,y) = \sin\alpha$ denotes the projective distance between $x, y \in \mathbb{P}^n$ ($\alpha$ being
the angle between $x$ and $y$).
The two-fold covering $p\colon S^n \to \mathbb{P}^n$ induces a measure $\nu$ on $\mathbb{P}^n$ by means of
$\nu(B) := \frac12 \operatorname{Vol}_n(p^{-1}(B))$ for $B \subset \mathbb{P}^n$, where $\operatorname{Vol}_n$ is the $n$-dimensional volume on the
sphere. Thus $\nu(\mathbb{P}^n) = \mathcal{O}_n/2$, where $\mathcal{O}_n := \operatorname{Vol}_n(S^n) = \frac{2\pi^{(n+1)/2}}{\Gamma((n+1)/2)}$.

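For $0 < \sigma \le 1$ we denote by $B_{\mathbb{P}}(a,\sigma)$ the open ball of projective radius $\sigma$ around
$a \in \mathbb{P}^n$. It is known that

$$\nu(B_{\mathbb{P}}(a,\sigma)) = \mathcal{O}_{n-1} \cdot I_n(\sigma),$$

where

$$I_n(\sigma) := \int_0^\sigma \frac{r^{n-1}}{\sqrt{1-r^2}}\,dr. \tag{2.1}$$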

The following bounds will prove useful on several occasions: 

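$$\frac{\sigma^n}{n} \le I_n(\sigma) \le \min\left\{\frac{1}{\sqrt{1-\sigma^2}},\ \sqrt{\frac{\pi n}{2}}\right\} \cdot \frac{\sigma^n}{n}. \tag{2.2}$$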
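The integral $I_n(\sigma)$ from (2.1) is easy to evaluate numerically after the substitution $r = \sin\varphi$, which turns it into $\int_0^{\arcsin\sigma} \sin^{n-1}\varphi\,d\varphi$. The Python sketch below does this with a composite Simpson rule and compares the result with two elementary bounds that follow from $1 \le 1/\sqrt{1-r^2} \le 1/\sqrt{1-\sigma^2}$ on $[0,\sigma]$, namely $\sigma^n/n \le I_n(\sigma) \le \sigma^n/(n\sqrt{1-\sigma^2})$. The function name `I`, the step count, and the sample values are assumptions for illustration only.

```python
import math

def I(m, sigma, steps=2000):
    """Approximate I_m(sigma) = int_0^sigma r^(m-1)/sqrt(1-r^2) dr.

    Uses r = sin(phi) and a composite Simpson rule on [0, asin(sigma)];
    `steps` must be even, and m >= 1 keeps the integrand bounded.
    """
    theta = math.asin(sigma)
    h = theta / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * math.sin(k * h) ** (m - 1)
    return total * h / 3.0

results = []
for n in (1, 2, 5, 10):
    for sigma in (0.1, 0.5, 0.9):
        value = I(n, sigma)
        lower = sigma ** n / n                        # integrand >= r^(n-1)
        upper = lower / math.sqrt(1.0 - sigma ** 2)   # integrand <= r^(n-1)/sqrt(1-sigma^2)
        results.append((n, sigma, lower, value, upper))
        print(f"n={n:2d} sigma={sigma:.1f}  {lower:.6e} <= {value:.6e} <= {upper:.6e}")
```

For small $\sigma$ the two bounds nearly coincide, which reflects the fact that a small projective ball is almost flat.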

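For $a \in \mathbb{P}^n$ and $\sigma \in (0,1]$ the uniform measure on $B_{\mathbb{P}}(a,\sigma)$ is defined by

$$\nu_{a,\sigma}(B) = \frac{\nu(B \cap B_{\mathbb{P}}(a,\sigma))}{\nu(B_{\mathbb{P}}(a,\sigma))} \tag{2.3}$$

for all Borel-measurable $B \subset \mathbb{P}^n$.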

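2.1. Uniform smoothed analysis. A reformulation of the main result in [1]
in the projective space setting can be written as follows.

Theorem 2.1. Let $\mathscr{C}$ be a conic condition number with set of ill-posed inputs
$\Sigma \subset \mathbb{P}^n$. Assume that $\Sigma$ is contained in the zero set in $\mathbb{P}^n$ of homogeneous polynomials
of degree at most $d$. Then, for all $\sigma \in (0,1]$ and all $t \ge t_0 = (2d+1)\frac{n}{\sigma}$,

$$\sup_{a \in \mathbb{P}^n} \operatorname{Prob}_{z \in B_{\mathbb{P}}(a,\sigma)} \{\mathscr{C}(z) \ge t\} \le 13\,dn\,\frac{1}{\sigma t}$$

and

$$\sup_{a \in \mathbb{P}^n} \mathbf{E}_{z \in B_{\mathbb{P}}(a,\sigma)} [\ln \mathscr{C}(z)] \le 2\ln n + 2\ln d + 2\ln\frac{1}{\sigma} + 5,$$

where Prob and $\mathbf{E}$ are taken with respect to $\nu_{a,\sigma}$.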
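To get a feel for the size of the bounds in Theorem 2.1, the following Python sketch simply evaluates the two right-hand sides, $13\,dn/(\sigma t)$ for $t \ge (2d+1)n/\sigma$ and $2\ln n + 2\ln d + 2\ln(1/\sigma) + 5$. The function names and the sample parameters are assumptions chosen for illustration and are not taken from [1].

```python
import math

def tail_bound(n, d, sigma, t):
    """Right-hand side 13*d*n/(sigma*t) of the tail estimate, valid for t >= t0."""
    t0 = (2 * d + 1) * n / sigma
    if t < t0:
        raise ValueError(f"the estimate is only stated for t >= t0 = {t0}")
    return 13.0 * d * n / (sigma * t)

def log_expectation_bound(n, d, sigma):
    """Right-hand side 2 ln n + 2 ln d + 2 ln(1/sigma) + 5 of the expectation estimate."""
    return 2.0 * math.log(n) + 2.0 * math.log(d) + 2.0 * math.log(1.0 / sigma) + 5.0

# Hypothetical sample parameters.
n, d, sigma = 100, 10, 0.01
t_nontrivial = 13.0 * d * n / sigma       # from here on the tail bound drops below 1
print("tail bound at t = 13dn/sigma :", tail_bound(n, d, sigma, t_nontrivial))
print("tail bound at 10 times that  :", tail_bound(n, d, sigma, 10.0 * t_nontrivial))
print("bound on E[ln C]             :", log_expectation_bound(n, d, sigma))
```

Since $13d > 2d + 1$ for $d \ge 1$, the tail estimate exceeds 1 at $t = t_0$ and only becomes informative once $t$ passes $13\,dn/\sigma$.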

As a consequence of this result, uniform smoothed analysis results for the condition
numbers of a variety of problems are obtained, including linear equation solving,
Moore-Penrose inversion, eigenvalue computation and polynomial system solving. The
bounds obtained are consistently of the same order of magnitude as the best bounds
obtained previously by ad hoc methods.

2.2. Uniformly Absolutely Continuous Distributions. In [1] a general boosting
mechanism was developed that allows extending any probabilistic analysis of a
condition number with respect to some chosen probability distribution over the input
data to a more general class of distributions.

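Let $\mu$ be a $\nu_{a,\sigma}$-absolutely continuous probability measure. Using the convention
$\ln(0) := -\infty$ we define, for $\delta \in (0,1)$,

$$\inf(\delta) := \inf\left\{ \frac{\ln \mu(B)}{\ln \nu_{a,\sigma}(B)} : B \text{ is Borel-measurable and } 0 < \nu_{a,\sigma}(B) \le \delta \right\}.$$

With these conventions, Theorem 2.2 of [1] shows that

$$\alpha_{\nu_{a,\sigma}}(\mu) := \lim_{\delta \to 0} \inf(\delta) \in [0,1]. \tag{2.4}$$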



Absolute continuity alone ensures that all $\nu_{a,\sigma}$-null-sets must be $\mu$-null-sets, but this
does not imply that $\mu(B)$ is small when $\nu_{a,\sigma}(B)$ is small and strictly positive. In
contrast, when $\alpha_{\nu_{a,\sigma}}(\mu) > 0$ then (2.4) gives uniform upper bounds on $\mu(B)$ in terms
of $\nu_{a,\sigma}(B)$. Furthermore, the smaller $\alpha_{\nu_{a,\sigma}}(\mu)$ gets, the larger the variation of $\mu$ in terms of
$\nu_{a,\sigma}$. If $\mu$ is $\nu_{a,\sigma}$-absolutely continuous and $\alpha_{\nu_{a,\sigma}}(\mu) > 0$, we therefore say that $\mu$ is
uniformly $\nu_{a,\sigma}$-absolutely continuous and call $\alpha_{\nu_{a,\sigma}}(\mu)$ the smoothness parameter of $\mu$
with respect to $\nu_{a,\sigma}$.

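The following result, which easily follows from (2.4), can be used to boost bounds
on tail probabilities with respect to $\nu_{a,\sigma}$ (as those in Theorem 2.1) to obtain similar
bounds on any uniformly $\nu_{a,\sigma}$-absolutely continuous probability measure $\mu$.

Proposition 2.2. $\alpha_{\nu_{a,\sigma}}(\mu)$ is the largest nonnegative real number $\alpha$ for which,
for every $\varepsilon \in (0,\alpha)$, there exists $\delta_\varepsilon > 0$ such that $\mu(B) \le \nu_{a,\sigma}(B)^{\alpha-\varepsilon}$ for all
measurable $B$ with $\nu_{a,\sigma}(B) \le \delta_\varepsilon$.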



3. Smoothed analysis for adversarial distributions. In this section we
present our main result, namely an extension of Theorem 2.1 to the case where we
have a radially symmetric distribution whose density has a pole at the point being
perturbed. We begin by introducing some notation.

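Let $a \in \mathbb{P}^n$ and $\sigma \in (0,1]$, and let $\nu_{a,\sigma}$ be the uniform measure on $B_{\mathbb{P}}(a,\sigma)$, as
defined in (2.3). Let $\mu$ be a $\nu_{a,\sigma}$-absolutely continuous probability measure on $\mathbb{P}^n$
with density $f(x)$. In other words,

$$\mu(B) = \int_B f(x)\,\nu_{a,\sigma}(dx)$$

for all events $B$. Assume further that $f\colon \mathbb{P}^n \to [0,\infty]$ is of the form $f(x) = g(d_{\mathbb{P}}(x,a))$,
with a monotonically decreasing function $g\colon [0,\sigma] \to [0,\infty]$ of the form

$$g(r) = C_{\beta,\sigma} \cdot r^{-\beta} \cdot h(r)$$

with $\beta < n$, where $C_{\beta,\sigma} = I_n(\sigma)/I_{n-\beta}(\sigma)$ and $h\colon [0,\sigma] \to \mathbb{R}_+$ is a continuous function
satisfying $h(0) \neq 0$ and

$$\int_0^\sigma h(r)\,\frac{r^{n-\beta-1}}{\sqrt{1-r^2}}\,dr = I_{n-\beta}(\sigma),$$

so that $\mu$ is a probability measure on $B_{\mathbb{P}}(a,\sigma)$. In other words, $f$ is radially symmetric
around $a$ with respect to $d_{\mathbb{P}}$ and has a pole of order $-\beta$ at $a$ in case $\beta > 0$. The normalizing
factor $C_{\beta,\sigma}$ is chosen to make $h(r) = 1$ a valid choice. Set $H := \sup_{0 \le r \le \sigma} h(r)$.
Note that $H \ge 1$, and that $H = 1$ implies $h \equiv 1$.

It will be important to have expressions for $\nu_{a,\sigma}(B)$ and $\mu(B)$ when $B = B_{\mathbb{P}}(a,\rho)$
is a projective ball. In this situation we have

$$\begin{aligned}
\mu(B_{\mathbb{P}}(a,\rho)) &= \int_{B_{\mathbb{P}}(a,\rho)} f(x)\,\nu_{a,\sigma}(dx) \\
&= \frac{1}{I_n(\sigma)} \int_0^\rho g(r)\,\frac{r^{n-1}}{\sqrt{1-r^2}}\,dr \\
&= \frac{1}{I_{n-\beta}(\sigma)} \int_0^\rho h(r)\,\frac{r^{n-\beta-1}}{\sqrt{1-r^2}}\,dr. \qquad (3.1)
\end{aligned}$$

Similarly,

$$\nu_{a,\sigma}(B_{\mathbb{P}}(a,\rho)) = \frac{I_n(\rho)}{I_n(\sigma)}. \tag{3.2}$$

In particular,

$$\inf_{0 \le r \le \rho} h(r) \cdot \frac{I_{n-\beta}(\rho)}{I_{n-\beta}(\sigma)} \le \mu(B_{\mathbb{P}}(a,\rho)) \le \sup_{0 \le r \le \rho} h(r) \cdot \frac{I_{n-\beta}(\rho)}{I_{n-\beta}(\sigma)}.$$
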
The main result of this note is the following. 

Theorem 3.1. Let $\mathscr{C}$ be a conic condition number with set of ill-posed inputs
$\Sigma \subset \mathbb{P}^n$, and assume $\Sigma$ is contained in a projective hypersurface of degree at most $d$.
Then

EM < 2 In(n) + ln(d) + In (- )+ In f ^ + f In ■ 2 ' /? ~" 



in J ~ v; w V "/ V 2 / 1-|V ln(7m/2) 

This result applies to the variety of problems mentioned after Theorem 2.1. The
statement of the Theorem follows from calculating the smoothness parameter $\alpha_{\nu_{a,\sigma}}(\mu)$
and the constants in Proposition 2.2. These are given by the following two lemmas,
to be proven later.

Lemma 3.2. The smoothness parameter of $\mu$ with respect to $\nu_{a,\sigma}$ is given by
$\alpha_{\nu_{a,\sigma}}(\mu) = 1 - \beta/n$.
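Lemma 3.2 can be illustrated numerically in the simplest case $h \equiv 1$, where the mass of a projective ball of radius $\rho$ is $\mu = I_{n-\beta}(\rho)/I_{n-\beta}(\sigma)$ under the adversarial measure and $\nu_{a,\sigma} = I_n(\rho)/I_n(\sigma)$ under the uniform one. The Python sketch below evaluates the ratio $\ln\mu/\ln\nu_{a,\sigma}$ on shrinking balls and compares it with $1 - \beta/n$. The quadrature helper `I`, the parameter values $n = 5$, $\beta = 2$, $\sigma = 0.5$, and the restriction to $h \equiv 1$ are assumptions made for this illustration.

```python
import math

def I(m, sigma, steps=2000):
    """Approximate I_m(sigma) = int_0^sigma r^(m-1)/sqrt(1-r^2) dr.

    Uses r = sin(phi) and a composite Simpson rule on [0, asin(sigma)];
    `steps` must be even, and m >= 1 keeps the integrand bounded.
    """
    theta = math.asin(sigma)
    h = theta / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * math.sin(k * h) ** (m - 1)
    return total * h / 3.0

n, beta, sigma = 5, 2, 0.5          # hypothetical sample parameters, beta < n

def log_ratio(rho):
    """ln(mu(ball)) / ln(nu(ball)) for a projective ball of radius rho, with h = 1."""
    mu = I(n - beta, rho) / I(n - beta, sigma)
    nu = I(n, rho) / I(n, sigma)
    return math.log(mu) / math.log(nu)

limit = 1.0 - beta / n
ratios = [(rho, log_ratio(rho)) for rho in (1e-1, 1e-2, 1e-4, 1e-8)]
for rho, ratio in ratios:
    print(f"rho={rho:.0e}  ln(mu)/ln(nu)={ratio:.5f}  (1 - beta/n = {limit})")
```

Only balls centered at $a$ are examined here; that these are the extremal sets is the content of Lemma 3.4.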

For the statement of the next Lemma, let $\varepsilon \in (0, 1 - \beta/n)$, and let



Pe ■= a 



1 - 



7m / \ V 7rn 



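Set $\delta_\varepsilon := I_n(\rho_\varepsilon)/I_n(\sigma)$.

Lemma 3.3. Let $B \subset \mathbb{P}^n$ be such that $\nu_{a,\sigma}(B) \le \delta_\varepsilon$. Then $\mu(B) \le (\nu_{a,\sigma}(B))^{1-\frac{\beta}{n}-\varepsilon}$.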

We are now ready to prove the main result. 
Proof of Theorem 3.1. Setting $\varepsilon = \frac12\left(1 - \frac{\beta}{n}\right)$ and using the bounds (2.2) we
obtain



From Theorem 2.1 it follows that for all $t \ge t_0 := \ln[(1+2d)n/\sigma]$,

$$\operatorname{Prob}\{\ln \mathscr{C} \ge t\} \le \frac{13\,dn}{\sigma}\,e^{-t}. \tag{3.4}$$



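Set

$$t_\varepsilon := \ln\left(\frac{13\,dn}{\sigma\,\delta_\varepsilon}\right) = \ln\left(\frac{13\,dn}{\sigma}\right) + \ln\left(\frac{1}{\delta_\varepsilon}\right).$$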



Using (3.3) we obtain



In 13— < t e a- In ; < In 13- 



^ ( 7rn ) 

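The lower bound shows that $t_\varepsilon \ge t_0$, so that for all $t \ge t_\varepsilon$,

$$\nu_{a,\sigma}(\{x : \ln \mathscr{C}(x) \ge t\}) = \operatorname{Prob}\{\ln \mathscr{C} \ge t\} \le \frac{13\,dn}{\sigma}\,e^{-t} \le \delta_\varepsilon.$$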



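Applying Lemma 3.3 it follows that for $t \ge t_\varepsilon$,

$$\operatorname{Prob}_\mu\{\ln \mathscr{C} \ge t\} = \mu(\{x : \ln \mathscr{C}(x) \ge t\}) \le \left(\frac{13\,dn}{\sigma}\,e^{-t}\right)^{\frac12\left(1-\frac{\beta}{n}\right)},$$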



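and hence,

$$\mathbf{E}_\mu[\ln \mathscr{C}] = \int_0^\infty \operatorname{Prob}_\mu\{\ln \mathscr{C} \ge t\}\,dt \le \int_0^{t_\varepsilon} 1\,dt + \int_{t_\varepsilon}^\infty \left(\frac{13\,dn}{\sigma}\,e^{-t}\right)^{\frac12\left(1-\frac{\beta}{n}\right)} dt.$$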



Using the bounds on t e and 5 e we get 



A small calculation shows that f 1 — (^) " J < V infa»/2) • This completes the 
proof. □ 



Epntf] < 21n(n)+ln(d)+ln ( -)+ ln f^V^T I ln " V ' 



5L)» 
-Trn / 



3.1. Proofs of Lemmas 3.2 and 3.3. The content of the following Lemma,
needed for calculating the smoothness parameter, should be intuitively clear.

Lemma 3.4. Let $0 < \delta < 1$. Then among all measurable sets $B \subset B_{\mathbb{P}}(a,\sigma)$ with
$0 < \nu_{a,\sigma}(B) \le \delta$, $\mu(B)$ is maximized by $B_{\mathbb{P}}(a,\rho)$ where $\rho \in (0,\sigma)$ is chosen so that
$\nu_{a,\sigma}(B_{\mathbb{P}}(a,\rho)) = \delta$.

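Proof. It clearly suffices to show that

$$\int_B f(x)\,\nu_{a,\sigma}(dx) \le \int_{B_{\mathbb{P}}(a,\rho)} f(x)\,\nu_{a,\sigma}(dx)$$

for all Borel sets $B \subset B_{\mathbb{P}}(a,\sigma)$ such that $\nu_{a,\sigma}(B) = \delta$. Indeed, we have

$$\begin{aligned}
\int_B f(x)\,\nu_{a,\sigma}(dx) &= \int_{B \cap B_{\mathbb{P}}(a,\rho)} f(x)\,\nu_{a,\sigma}(dx) + \int_{B \setminus B_{\mathbb{P}}(a,\rho)} f(x)\,\nu_{a,\sigma}(dx) \\
&\le \int_{B \cap B_{\mathbb{P}}(a,\rho)} f(x)\,\nu_{a,\sigma}(dx) + g(\rho)\,\nu_{a,\sigma}(B \setminus B_{\mathbb{P}}(a,\rho)) \\
&= \int_{B \cap B_{\mathbb{P}}(a,\rho)} f(x)\,\nu_{a,\sigma}(dx) + g(\rho)\,\nu_{a,\sigma}(B_{\mathbb{P}}(a,\rho) \setminus B) \qquad (3.5) \\
&\le \int_{B \cap B_{\mathbb{P}}(a,\rho)} f(x)\,\nu_{a,\sigma}(dx) + \int_{B_{\mathbb{P}}(a,\rho) \setminus B} f(x)\,\nu_{a,\sigma}(dx) \\
&= \int_{B_{\mathbb{P}}(a,\rho)} f(x)\,\nu_{a,\sigma}(dx),
\end{aligned}$$

where we have used $\nu_{a,\sigma}(B_{\mathbb{P}}(a,\rho)) = \delta = \nu_{a,\sigma}(B)$ in (3.5). The two inequalities hold because $g$ is
monotonically decreasing, so that $f \le g(\rho)$ outside $B_{\mathbb{P}}(a,\rho)$ and $f \ge g(\rho)$ inside it. This proves our claim. □
Even though $\rho$ is a function of $\delta$, we will not reflect this notationally in the sequel.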
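The exchange argument behind Lemma 3.4 has a simple discrete analogue: if cells of equal uniform mass carry a density that decreases with the distance from the center, then among all collections of $k$ cells the $k$ innermost ones have the largest $\mu$-mass. The Python sketch below confirms this by brute force for a toy one-dimensional configuration. The cell radii, the exponent, and all names are assumptions for illustration, and the code is not a substitute for the measure-theoretic proof.

```python
from itertools import combinations

beta = 1.5                                        # hypothetical pole exponent
radii = [0.05 * (i + 1) for i in range(10)]       # cell centres at increasing distance
density = [r ** (-beta) for r in radii]           # g(r) = r^(-beta), unnormalised, h = 1

def mass(cells):
    """mu-mass of a collection of cells, each of equal uniform mass (taken as 1)."""
    return sum(density[i] for i in cells)

k = 4                                             # fixed uniform mass: k cells
best = max(combinations(range(len(radii)), k), key=mass)
print("cells maximising the mass:", best)
print("mass of the innermost k cells:", mass(range(k)))
```

Because the density is strictly decreasing, the maximizer is unique in this toy setting.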

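Proof of Lemma 3.2. From (3.1), (3.2) and (2.2) we get bounds of the form

$$\frac{1}{C_1} \cdot \rho^n \le \nu_{a,\sigma}(B_{\mathbb{P}}(a,\rho)) \le C_1 \cdot \rho^n, \tag{3.6}$$

$$\inf_{0 \le r \le \rho} h(r) \cdot \frac{1}{C_2} \cdot \rho^{n-\beta} \le \mu(B_{\mathbb{P}}(a,\rho)) \le \sup_{0 \le r \le \rho} h(r) \cdot C_2 \cdot \rho^{n-\beta}, \tag{3.7}$$

where the constants $C_i$ do not depend on $\rho$.
We thus have (using Lemma 3.4)

$$\begin{aligned}
\alpha_{\nu_{a,\sigma}}(\mu) &= \lim_{\delta \to 0} \inf\left\{ \frac{\ln \mu(B)}{\ln \nu_{a,\sigma}(B)} : B \text{ measurable},\ 0 < \nu_{a,\sigma}(B) \le \delta \right\} \\
&= \lim_{\rho \to 0} \frac{\ln \mu(B_{\mathbb{P}}(a,\rho))}{\ln \nu_{a,\sigma}(B_{\mathbb{P}}(a,\rho))} \\
&\begin{cases}
\le \displaystyle\lim_{\rho \to 0} \frac{\ln(\inf h(r)/C_2) + (n-\beta)\ln\rho}{\ln(C_1) + n\ln\rho} = 1 - \frac{\beta}{n}, \\[2ex]
\ge \displaystyle\lim_{\rho \to 0} \frac{\ln(C_2 \cdot \sup h(r)) + (n-\beta)\ln\rho}{\ln(1/C_1) + n\ln\rho} = 1 - \frac{\beta}{n}.
\end{cases}
\end{aligned}$$

This concludes the proof. □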



Proof of Lemma 3.3. Since sets of the form $B_{\mathbb{P}}(a,\rho)$ maximise $\mu(B)$ among all
measurable sets $B \subset B_{\mathbb{P}}(a,\sigma)$ such that $\nu_{a,\sigma}(B) \le \delta$ for any $\delta$, we may w.l.o.g. assume
$B = B_{\mathbb{P}}(a,\rho)$. By (3.1) and (3.2) our task amounts to showing



In-p{o) \I n (<r), 

for $\rho \le \rho_\varepsilon$. And indeed, using the bounds (2.2), we get



In-p(o) yjl- p 2 Vcr 

1 



n-/8 



(9 



< 



< 



V 7m / 



0^7 



2 \(l-#-<0/(ne) 



where for the last inequality we use the bounds 




again. Moreover, we have 



< Or < 



PS P 




Therefore, y / l-(^ r ) (1 » 



s " < \/l — P 2 , completing the proof. 



□ 